A Bayesian framework for knowledge driven regression model in micro-array data analysis
Abstract
This paper addresses the sparse data problem in linear regression, in which the number of variables is significantly larger than the number of data points. We assume that, in addition to the measured data points, prior knowledge about the input variables may be provided in the form of pairwise similarities. We present a full Bayesian framework that effectively exploits this similarity information for linear regression. Empirical studies with gene expression data show that regression errors can be reduced significantly by incorporating similarity information derived from the Gene Ontology.
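The abstract does not state the model's equations, so the following is only an illustrative sketch of how pairwise similarity between input variables can enter a linear regression when there are more variables than data points. It is not the paper's full Bayesian procedure: it computes a single MAP point estimate under an assumed zero-mean Gaussian prior whose precision is built from the graph Laplacian of a similarity matrix. The function name `similarity_map_weights`, the parameters `lam` and `ridge`, the unit noise variance, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def similarity_map_weights(X, y, S, lam=1.0, ridge=1e-3):
    """MAP weights for y ~ N(Xw, I) with prior precision lam * L + ridge * I.

    L is the graph Laplacian of the symmetric, non-negative similarity
    matrix S, so the prior penalty is proportional to
    sum_{i<j} S_ij * (w_i - w_j)^2: similar variables get similar weights.
    (Assumed formulation, not taken from the paper.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    S = np.asarray(S, dtype=float)
    laplacian = np.diag(S.sum(axis=1)) - S
    p = X.shape[1]
    # The ridge term keeps the linear system solvable when p > n.
    A = X.T @ X + lam * laplacian + ridge * np.eye(p)
    return np.linalg.solve(A, X.T @ y)


# Synthetic toy data (illustration only): 5 samples, 20 variables.
rng = np.random.default_rng(0)
n, p = 5, 20
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[:2] = 1.0  # variables 0 and 1 share the same effect
y = X @ w_true + 0.1 * rng.normal(size=n)

S = np.zeros((p, p))
S[0, 1] = S[1, 0] = 1.0  # assumed prior knowledge: variables 0 and 1 are similar

w_plain = similarity_map_weights(X, y, S, lam=0.0)  # ridge only
w_tied = similarity_map_weights(X, y, S, lam=1e6)   # strong similarity prior
```

With a large `lam`, the weights of variables 0 and 1 are pulled together, which is the sense in which similarity information can compensate for having too few data points. In a microarray setting, `S` would hypothetically be derived from gene ontology annotations.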
Similar resources
A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data
Large survey data are often accompanied by sampling weights that reflect the unequal selection probabilities of a complex sampling design. Sampling weights act as expansion factors that scale the subjects so that the sample becomes representative of the population. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...
Comparison of Maximum Likelihood Estimation and Bayesian with Generalized Gibbs Sampling for Ordinal Regression Analysis of Ovarian Hyperstimulation Syndrome
Background and Objectives: Analysis of ordinal outcome data can lead to biased estimates and large variance when the data are sparse. The objective of this study is to compare parameter estimates of an ordinal regression model under the maximum likelihood and Bayesian frameworks, the latter with generalized Gibbs sampling. The models were used to analyze ovarian hyperstimulation syndrome data. Methods: This study use...
Learning Phenotype Specific Gene Network by Knowledge Driven Matrix Factorization
A popular method for reconstructing gene networks from micro-array data is Bayesian structure learning. However, most Bayesian structure learning algorithms suffer from three major shortcomings: high computational cost, inefficiency in exploring qualitative knowledge, and inability to reconstruct phenotype-specific gene networks. We address these three shortcomings by presenting a ...
The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data
The goal of this study is to introduce a statistical method that uses latent data for the regression analysis of discrete data, and to relate a probit regression model (for the discrete response) to a normal linear regression model (for the continuous latent data). This method provides precise inferences on binary and multinomia...
Bayesian and Iterative Maximum Likelihood Estimation of the Coefficients in Logistic Regression Analysis with Linked Data
This paper considers logistic regression analysis with linked data. It is shown that, in this setting, a finite mixture of Bernoulli distributions can be used to model the response variables. We propose an iterative maximum likelihood estimator for the regression coefficients that takes the matching probabilities into account. Next, the Bayesian counterpart...
Journal:
- International Journal of Data Mining and Bioinformatics
Volume 2, Issue 3
Publication date: 2008